Abstract

We prove the following results concerning the list decoding of error-correcting codes: 

1. We show that for any code with a relative distance of $\delta$ (over a large enough alphabet), the
following result holds for random errors: With high probability, for a $\rho \le \delta - \varepsilon$ fraction of
random errors (for any $\varepsilon > 0$), the received word will have only the transmitted codeword
in a Hamming ball of radius $\rho$ around it. Thus, for random errors, one can correct twice
the number of errors uniquely correctable from worst-case errors for any code. A variant 
of our result also gives a simple algorithm to decode Reed-Solomon codes from random 
errors that, to the best of our knowledge, runs faster than known algorithms for certain 
ranges of parameters. 

2. We show that concatenated codes can achieve the list decoding capacity for erasures. A 
similar result for worst-case errors was proven by Guruswami and Rudra (SODA 08), 
although their result does not directly imply our result. Our results show that a subset of
the random ensemble of codes considered by Guruswami and Rudra also achieves the list
decoding capacity for erasures.

Our proofs employ simple counting and probabilistic arguments. 
1 Introduction



List decoding is a relaxation of the traditional unique decoding paradigm, where one is allowed to 
output a list of codewords that are close to the received word. This relaxation allows for designing 
list decoding algorithms that can recover from scenarios where almost all of the redundancy could 
have been corrupted |18^ [HI \T5\ [6]. In particular, one can design binary codes from which one can 
recover from a $1/2 - \varepsilon$ fraction of errors. This fact has led to many surprising applications in
complexity theory- see e.g. the survey by Sudan [T^ and Guruswami's thesis [H Chap. 12]. 

The results mentioned above mostly deal with worst-case errors, where the channel is considered 
to be an adversary that can corrupt any arbitrary fraction of symbols (with an upper bound on the 
maximum fraction of such errors). In this work, we deal with random and erasure noise models, 
which are weaker than the worst-case errors model, and which also have interesting applications in 
complexity theory. 

1.1 Random Errors 

It is well-known that for worst-case errors, one cannot uniquely recover the transmitted codeword if 
the total number of errors exceeds half the distance. (We refer the reader to Section [2] for definitions 
related to codes.) List decoding circumvents this by allowing the decoder to output multiple nearby 
codewords. In situations where the decoder has access to some side information, one can prune the 
output list to obtain the transmitted codeword. In fact, most of the applications of list decoding in 
complexity theory crucially use side information. However, a natural question to ask is what one 
can do in situations where there is no side information (this is not an uncommon assumption in 
the traditional point-to-point communication model). 

In such a scenario, it makes sense to look at a weaker random noise model and try to argue 
that the pathological cases that prevent us from decoding a code with relative distance $\delta$ from more
than a $\delta/2$ fraction of errors are rarely encountered.

Before we move on, we digress a bit to establish our notion of random errors. In our somewhat 
non-standard model, we assume that the adversary can pick the location of the $\rho$ fraction of error
positions but that the errors themselves are random. For the binary case, this model coincides
with worst-case errors, so in this work, we consider alphabet size $q \ge 3$. We believe that this is
a nice intermediary between the worst-case noise model and the more popular models of random noise,
where errors are independent across different symbols. Indeed, a result with high probability in our
random noise model (for roughly a $\rho$ fraction of errors) immediately implies a similar result for a more benign
random noise model such as the $q$-ary symmetric noise channel with cross-over probability $\rho$.^1 For
the rest of the paper, when we say random errors, we will be referring to the stronger random noise
model above.
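
The noise model above can be made concrete with a small sketch. The function names `sample_error_pattern` and `corrupt` are illustrative and not taken from the paper, and addition modulo $q$ is assumed as the operation on the alphabet $\{0, \dots, q-1\}$. The error positions are supplied by the caller (the "adversary"), while the error values are drawn uniformly from the non-zero symbols.

```python
import random


def sample_error_pattern(n, q, error_positions, rng=None):
    """Return an error vector of length n over {0, ..., q-1}.

    The positions in `error_positions` are chosen by the caller (adversary);
    each such position gets a uniformly random NON-ZERO value, so the
    Hamming weight of the result equals the number of distinct positions.
    """
    if q < 2:
        raise ValueError("alphabet size q must be at least 2")
    rng = rng or random.Random()
    e = [0] * n
    for i in error_positions:
        e[i] = rng.randrange(1, q)  # uniform over {1, ..., q-1}
    return e


def corrupt(codeword, e, q):
    """Add the error pattern to the codeword symbol-wise (assumed: mod q)."""
    return [(c + x) % q for c, x in zip(codeword, e)]
```

For $q = 2$ the only non-zero value is 1, so the error pattern is fully determined by the adversarial positions; this is why the model coincides with worst-case errors in the binary case.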

Related Work. The intuition that pathological worst-case errors are rare has been formalized 
for certain families of codes. For example, McEliece showed that for Reed-Solomon codes with
relative distance $\delta$, with high probability, for a fraction $\rho \le \delta - \varepsilon$ of random errors, the output list size
is one [14].^2 Further, for most codes of rate $1 - H_q(\rho) - \varepsilon$, with high probability, for a $\rho$ fraction
of random errors, the output list size is one. (This follows from Shannon's famous result on the

^1 In this model, every transmitted symbol remains untouched with probability $1 - \rho$ and is mapped to each of the other
$q - 1$ possible symbols with probability $\rho/(q - 1)$. Finally, the noise acts independently on each symbol.
^2 The actual result is slightly weaker: see Section [3] for more details.
capacity of the $q$-ary symmetric channel: for a proof, see e.g. [IZ].) It is also known that most
codes of rate $1 - H_q(\rho) - \varepsilon$ have relative distance at least $\rho$. Further, for $q \ge 2^{\Omega(1/\varepsilon)}$, it is known
that such a code cannot have relative distance more than $\rho + \varepsilon$: this follows from the Singleton bound and
the fact that for such an alphabet size, $1 - H_q(\rho) \ge 1 - \rho - \varepsilon$ (cf. [TBI Sec 2.2.2]).

Our Results. In our first main result, we show that the phenomenon above is universal, that
is, for every $q$-ary code, with $q \ge 2^{\Omega(1/\varepsilon)}$, the following property holds: if the code has relative
distance $\delta$, then for any $\rho \le \delta - \varepsilon$ fraction of random errors, with high probability, the Hamming
ball of fractional radius $\rho$ around the received word will only have the transmitted codeword in
it. We would like to point out three related points. First, our result implies that if we relax the
worst-case error model to a random error model, then combinatorially one can always correct twice
the number of errors. Second, one cannot hope to correct more than a $\delta$ fraction of random errors:
it is easy to see that, for instance, for Reed-Solomon codes, any error pattern of relative Hamming
weight $\rho > \delta$ will give rise to a list size greater than one. Finally, the proof of our result follows
from a fairly straightforward counting argument. 

A natural follow-up question to our result is whether the lower bound of $2^{\Omega(1/\varepsilon)}$ on $q$ can be
relaxed. We show that if $q$ is $2^{o(1/\varepsilon)}$, then the result above is not true. This negative result
follows from the following two observations/results. First, it is known that for any code with rate
$1 - H_q(\rho) + \varepsilon$, the average list size, over all possible received words, is exponential. Second, it is
known that Algebraic-Geometric (AG) codes over alphabets of size at least 49 can have relative
distance strictly bigger than $1 - H_q(\rho)$ (cf. [10]). However, these two results do not immediately
imply the negative result for the random error case. In particular, what we need to show is that
there is at least one codeword $c$ such that for most error patterns $e$ of relative Hamming weight $\rho$,
the received word $c + e$ has at least one codeword other than $c$ within a relative Hamming distance
of $\rho$ from it. To show that this can indeed be true for AG codes, we use a generalization of an
"Inverse Markov argument" from Dumer et al. [1]. 

A Cryptographic Application. In addition to being a natural noise model to study, list decoding
in the random error model has applications in cryptography. In particular, Kiayias and
Yung have proposed cryptosystems based on the hardness of decoding Reed-Solomon codes [11].
However, if for Reed-Solomon codes (of rate $R$), one can list decode a $\rho$ fraction of random errors,
then the cryptosystem from [11] can be broken for the corresponding parameter settings. Since
Guruswami-Sudan can solve this problem for $\rho \le 1 - \sqrt{R}$ for worst-case errors [8], Kiayias and
Yung set the parameter $\rho > 1 - \sqrt{R}$. Beyond the $1 - \sqrt{R}$ bound, to the best of our knowledge,
the only known algorithms to decode Reed-Solomon codes are the following trivial ones: (i) Go
through all possible $q^k$ codewords and output all the codewords within relative Hamming distance $\rho$ of
the received word; and (ii) Go through all possible $\binom{n}{\rho n}$ error locations and output the codeword,
if any, that agrees in the $(1 - \rho)n$ "non-error" locations.

It is interesting to note that each of the three algorithms mentioned above works in the stronger
model of worst-case errors. However, since we only care about decoding from random errors, one 
might hope to design better algorithms that make use of the fact that the errors are random. In 
this paper, we show that (essentially) the proof of our first main result implies a related result that 
in turn implies a modest improvement in the running time of algorithms to decode Reed-Solomon 
codes from a $\rho > 1 - \sqrt{R}$ fraction of random errors. The related result states the following: for any
code with relative distance $\delta$ (over a large enough alphabet), with high probability, for a $\rho$ fraction
of random errors, Hamming balls of fractional radius $\delta - \varepsilon$ around the received word only have the
transmitted codeword in them.^3 Note that unlike the statement of our result mentioned earlier,
we are considering Hamming balls of radius larger than the fraction of errors. This allows us to
improve the second trivial algorithm in the paragraph above so that one needs to verify fewer "error
patterns." This leads to an asymptotic improvement in the running time over both of the trivial
algorithms for certain settings of parameters, though the running time is still exponential and thus
too expensive to break the Kiayias-Yung cryptosystem.

1.2 Erasures 

In the second part of the paper, we consider the erasure noise model, where the decoder knows the 
locations of the errors. (However, the error locations are still chosen by the adversary.) Intuitively, 
this noise model is weaker than the general worst-case noise model as the decoder knows for sure 
which locations are uncorrupted. This intuition can also be formalized. E.g., it is known that for 
a $\rho$ fraction of worst-case errors, the list decoding capacity is $1 - H_q(\rho)$, whereas for a $\rho$ fraction
of erasures, the list decoding capacity is $1 - \rho$ (cf. [U Chapter 10]). Note that the capacity for
erasures is independent of the alphabet size. As another example, for a linear code, a combinatorial 
guarantee on list decodability from erasures gives a polynomial time list decoding algorithm. By 
contrast, such a result is not known for worst-case errors. 
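
To make the comparison between the two capacities concrete, the following sketch evaluates the $q$-ary entropy function $H_q(x) = x \log_q(q-1) - x \log_q x - (1-x)\log_q(1-x)$ and the two capacity expressions quoted above. The function names are illustrative only and do not come from the paper.

```python
import math


def q_ary_entropy(x, q):
    """H_q(x) = x log_q(q-1) - x log_q(x) - (1-x) log_q(1-x), for 0 <= x <= 1."""
    if not 0.0 <= x <= 1.0:
        raise ValueError("x must lie in [0, 1]")
    if x == 0.0:
        return 0.0
    if x == 1.0:
        return math.log(q - 1, q)
    return (x * math.log(q - 1, q)
            - x * math.log(x, q)
            - (1 - x) * math.log(1 - x, q))


def error_capacity(rho, q):
    """List decoding capacity for a rho fraction of worst-case errors."""
    return 1.0 - q_ary_entropy(rho, q)


def erasure_capacity(rho):
    """List decoding capacity for a rho fraction of erasures (independent of q)."""
    return 1.0 - rho
```

For example, over the binary alphabet `error_capacity(0.25, 2)` is well below `erasure_capacity(0.25)`, while for a very large alphabet the two values are close, in line with the bound $1 - H_q(\rho) \ge 1 - \rho - \varepsilon$ for $q \ge 2^{\Omega(1/\varepsilon)}$ mentioned in the introduction.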

As is often the case, the capacity result is proven by random coding arguments. A natural quest
then is to design explicit linear codes that achieve the list decoding capacity for erasures; this is an
important milestone in the program of designing explicit codes that achieve list decoding capacity
for worst-case errors. This goal is the primary motivation for our second main result.

Our Result and Related Work. For large enough alphabets, explicit linear codes that achieve 
list decoding capacity for erasures are not hard to find: e.g., Reed-Solomon codes achieve the 
capacity. For smaller alphabets, the situation is much different. For binary codes, Guruswami 
presented explicit linear codes that can handle a $\rho = 1 - \varepsilon$ fraction of erasures with rate $\Omega(\varepsilon^2/\log(1/\varepsilon))$ [3].
For alphabets of size $2^t$, a $1 - \varepsilon$ fraction of erasures can be list decoded with explicit linear codes of
rate [...]. Thus, especially for binary codes, an explicit code with capacity
of $1 - \rho$ is still a lofty goal. (In fact, breaking the rate barrier for polynomially small $\varepsilon$ would
imply explicit construction of certain bipartite Ramsey graphs, solving an open question [3].)

To gain a better understanding about codes that achieve list decoding capacity for erasures, a
natural question is to ask whether concatenated codes can achieve the list decoding capacity for
erasures. Concatenated codes are the preeminent method to construct good list decodable codes
over small alphabets. In fact, the best explicit list decodable binary codes (for both erasures [3] and
worst-case errors [7]) are concatenated codes. Briefly, in code concatenation, an "outer" code over
a large alphabet is first used to encode the message. Then "inner" codes over the smaller alphabet
are used to encode each of the symbols in the outer codeword. These inner codes typically have a
much smaller block length than the outer code, which allows one to use brute-force type algorithms
to search for "good" inner codes. Also note that the rate of the concatenated code is the product
of the rate of the outer and inner codes.

^3 A similar result was shown for Reed-Solomon codes by McEliece [14].
Given that concatenated codes have such a rigid structure, it seems plausible that such codes
would not be able to achieve list decoding capacity. For the worst-case error model, Guruswami
and Rudra showed that there do exist concatenated codes that achieve list decoding capacity [5].
However, for erasures there is an additional potential complication that does not arise in the
worst-case error case. In particular, consider erasure patterns in which a $\rho$ fraction of the outer symbols
are completely erased. It is clear from this example that the outer code needs to have rate very close
to $1 - \rho$. However, note that to approach list decoding capacity for erasures, the concatenated code
needs to have rate $1 - \rho - \varepsilon$. This means that the inner codes need to have rate very close to 1. By
contrast, even though the result of [5] has some restrictions on the rate of the inner codes, it is not
nearly as stringent as the requirement above. (The restriction in [5] seems to be an artifact of the 
proof, whereas for erasures, the restriction is unavoidable.) Further, this restriction on the inner 
rate is just by looking at a specific class of erasure patterns. It is reasonable to wonder if when 
taking into account all possible erasure patterns, we can rule out the possibility of concatenated 
codes achieving the list decoding capacity for erasures. 

In our second main result, we show that concatenated codes can achieve the list decoding 
capacity for erasures. In fact, we show that choosing the outer code to be a Folded Reed-Solomon 
code ([6]) and picking the inner codes to be random independent linear codes with rate 1, will with 
high probability, result in a linear code that achieves the list decoding capacity for erasures. We 
show a similar result (but with better bounds on the list size) when the outer code is also chosen 
to be a random linear code. Both of these ensembles were shown to achieve the list decoding 
capacity for errors in [5], although, as mentioned earlier, the result for errors holds for a superset 
of concatenated codes (as the inner codes could have rates strictly less than 1). The proof of our 
result is similar to the proof structure in [5]. Because we are dealing with the more benign erasure
noise model, some of the calculations in our proofs are much simpler than the corresponding ones 
in [5]. 

Approximating NP Witnesses. We conclude this section by pointing out that an application
of binary codes that are list decodable from erasures is to the problem of approximating
NP-witnesses [2, 12]. For any NP-language $L$, we have a polynomial-time decidable relation $R_L(\cdot, \cdot)$
such that $x \in L$ if and only if there exists a polynomially sized witness $w$ such that $R_L(x, w)$
accepts. Thus, for an NP-complete language we do not expect to be able to compute the witness $w$
in polynomial time given $x$. A natural notion of approximation is the following: given an $\varepsilon$ fraction
of the bits in a correct witness $w$, can we verify if $x \in L$ in polynomial time? The results in [2, 12]
show that such an approximation is not possible unless P=NP.

To be more precise, Gal et al. ([2]) consider the following problem: given a SAT formula $\phi$
over $n$ variables can we, in polynomial time, compute another SAT formula $\phi'$ over $N = \mathrm{poly}(n)$
variables such that given $\varepsilon N$ bits from a satisfying assignment to $\phi'$, we can compute a satisfying
assignment to the original formula $\phi$?

Kumar and Sivakumar's ([12]) reduction works for any NP-language $L$. However, their reduction
computes a polynomial-time computable relation $R'_L$ (with witness size $N = \mathrm{poly}(n)$), which is
different from the original predicate $R_L$, such that the knowledge of $\varepsilon N$ many bits of some satisfying
witness for $R'_L$ can be used in polynomial time to compute a satisfying witness for $R'_L$. Both of
these results are proven by picking a linear binary code $C$ that can be list decoded from a $1 - \varepsilon$
fraction of erasures and "encoding" $C(x)$ (where $x$ is the input) into the definition of $\phi'$ (in the case
of [2]) or $R'_L$ (in the case of [12]). The intuition behind these reductions is that given sufficiently
many bits of a satisfying witness, we can obtain a list of potentially satisfying witnesses by running
the list decoding algorithm for $C$ to recover from the erasures. (The connection to list decoding
was implicit in [2]; it was made explicit in [12].)

Guruswami and Sudan ([9]) show that the reductions above can be made to work with $\varepsilon =
N^{-1/2+\gamma}$ for the Kumar and Sivakumar problem and with $\varepsilon = N^{-1/4+\gamma}$ for the Gal et al. problem
(for any constant $\gamma > 0$). An explicit linear code that meets the list decoding capacity for erasures
will improve the value of $\varepsilon$ above to $N^{-1+\gamma}$ and $N^{-1/2+\gamma}$, respectively.

Organization of the Paper. We begin with some preliminaries in Section [2]. We present our
first main result on random errors in Section [3] and our second main result on erasures in Section [4].

2 Preliminaries 

For an integer $m \ge 1$, we will use $[m]$ to denote the set $\{1, \dots, m\}$.

Basic Coding Definitions. A code $C$ of dimension $k$ and block length $n$ over an alphabet $\Sigma$
is a subset of $\Sigma^n$ of size $|\Sigma|^k$. The rate of such a code equals $k/n$. Each $n$-tuple in $C$ is called a
codeword. Let $\mathbb{F}_q$ denote the field with $q$ elements. A code $C$ over $\mathbb{F}_q$ is called a linear code if $C$
is a subspace of $\mathbb{F}_q^n$. In this case the dimension of the code coincides with the dimension of $C$ as
a vector space over $\mathbb{F}_q$. By abuse of notation we can also think of a linear code $C$ as a map from
an element in $\mathbb{F}_q^k$ to its corresponding codeword in $\mathbb{F}_q^n$, mapping a row vector $x \in \mathbb{F}_q^k$ to a vector
$xG \in \mathbb{F}_q^n$ via a $k \times n$ matrix $G$ over $\mathbb{F}_q$ which is referred to as the generator matrix.

The Hamming distance between two vectors $x, y \in \Sigma^n$, denoted by $\Delta(x, y)$, is the number
of places they differ in. The (minimum) distance of a code $C$ is the minimum Hamming distance
between any two distinct codewords from C. The relative distance is the ratio of the distance to 
the block length. 
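
As a minimal sketch of these definitions (with illustrative function names that are not from the paper), the Hamming distance and the minimum and relative distance of a small code given as an explicit list of codewords can be computed by brute force:

```python
from itertools import combinations


def hamming_distance(x, y):
    """Number of positions in which the equal-length vectors x and y differ."""
    if len(x) != len(y):
        raise ValueError("vectors must have the same length")
    return sum(a != b for a, b in zip(x, y))


def minimum_distance(code):
    """Minimum Hamming distance over all pairs of distinct codewords."""
    return min(hamming_distance(c1, c2) for c1, c2 in combinations(code, 2))


def relative_distance(code):
    """Minimum distance divided by the block length."""
    n = len(code[0])
    return minimum_distance(code) / n
```

The pairwise search is quadratic in the number of codewords, so this is only meant for toy examples such as a length-3 repetition or parity code.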

We will need the following notions of the weight of a vector. Given a vector $v \in \{0, 1, \dots, q-1\}^n$,
its Hamming weight, which is the number of non-zero entries in the vector, is denoted by WT$(v)$.
Given a vector $y = (y_1, \dots, y_n) \in \{0, \dots, q-1\}^n$ and a subset $S \subseteq [n]$, $y_S$ will denote the subvector
$(y_i)_{i \in S}$, and WT$_S(y)$ will denote the Hamming weight of $y_S$.

Code Concatenation. Concatenated codes are constructed from two different types of codes
that are defined over alphabets of different sizes. If we are interested in a concatenated code over
$\mathbb{F}_q$, then the outer code $C_{out}$ is defined over $\mathbb{F}_Q$, where $Q = q^k$ for some positive integer $k$, and
has block length $N$. The second type of codes, called the inner codes, and which are denoted
by $C_{in}^1, \dots, C_{in}^N$, are defined over $\mathbb{F}_q$ and are each of dimension $k$ (note that the message space
of $C_{in}^i$ for all $i$ and the alphabet of $C_{out}$ have the same size). The concatenated code, denoted
by $C = C_{out} \circ (C_{in}^1, \dots, C_{in}^N)$, is defined as follows: Let the rate of $C_{out}$ be $R$ and let the block
lengths of $C_{in}^i$ be $n$ (for $1 \le i \le N$). Define $K = RN$ and $r = k/n$. The input to $C$ is a vector
$m = (m_1, \dots, m_K) \in (\mathbb{F}_Q)^K$. Let $C_{out}(m) = (x_1, \dots, x_N)$. The codeword in $C$ corresponding to $m$
is defined as follows

$C(m) = (C_{in}^1(x_1), C_{in}^2(x_2), \dots, C_{in}^N(x_N))$.
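
The definition above can be illustrated by a small encoder sketch. Several simplifying assumptions are made here that are not part of the paper: $q$ is prime (so arithmetic modulo $q$ gives the field $\mathbb{F}_q$), each symbol of $\mathbb{F}_Q$ with $Q = q^k$ is represented as a length-$k$ tuple over $\mathbb{F}_q$, and the outer encoder is passed in as an arbitrary function `outer_encode`. All names are hypothetical.

```python
def encode_linear(x, G, q):
    """Compute the row-vector/matrix product xG over F_q (q assumed prime)."""
    k, n = len(G), len(G[0])
    if len(x) != k:
        raise ValueError("message length must equal the number of rows of G")
    return tuple(sum(x[i] * G[i][j] for i in range(k)) % q for j in range(n))


def concatenated_encode(message, outer_encode, inner_generators, q):
    """Encode `message` with C_out, then encode the i-th outer symbol with G_i.

    outer_encode: maps the message to a list of N outer symbols, each a
                  length-k tuple over F_q (a representation of F_Q, Q = q^k).
    inner_generators: list of N generator matrices, each k x n over F_q.
    """
    outer_codeword = outer_encode(message)
    if len(outer_codeword) != len(inner_generators):
        raise ValueError("need exactly one inner code per outer position")
    result = []
    for x_i, G_i in zip(outer_codeword, inner_generators):
        result.extend(encode_linear(x_i, G_i, q))
    return tuple(result)
```

As a toy usage, one can take an outer repetition code of length $N = 2$ over $\mathbb{F}_4$ (symbols written as pairs of bits) and two different invertible $2 \times 2$ binary matrices as rate-1 inner codes; the resulting binary code has rate $1/2$, the product of the outer and inner rates.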

The outer code $C_{out}$ in this paper will either be a random linear code over $\mathbb{F}_Q$ or the folded
Reed-Solomon code from [6]. In the case when $C_{out}$ is random linear, we will pick $C_{out}$ by selecting
$K = RN$ vectors uniformly at random from $\mathbb{F}_Q^N$ to form the rows of the generator matrix. For
every position $1 \le i \le N$, we will choose an inner code $C_{in}^i$ to be a random linear code over $\mathbb{F}_q$
of block length $n$ and rate $r = k/n$. In particular, we will work with the corresponding generator
matrices $G_i$, where every $G_i$ is a random $k \times n$ matrix over $\mathbb{F}_q$. All the generator matrices $G_i$ (as
well as the generator matrix for $C_{out}$, when we choose a random $C_{out}$) are chosen independently.
This fact will be used crucially in our proofs. 

List Decoding. We define some terms related to list decoding. 

Definition 1 (List decodable code for errors). For $0 < \rho < 1$ and an integer $L \ge 1$, a code $C \subseteq \Sigma^n$
is said to be $(\rho, L)$-list decodable if for every $y \in \Sigma^n$, the number of codewords in $C$ that are within
Hamming distance $\rho n$ from $y$ is at most $L$.

Given a vector $c = (c_1, \dots, c_n) \in \Sigma^n$ and an erased received word $y = (y_1, \dots, y_n) \in (\Sigma \cup \{?\})^n$,^4
we will use $c \simeq y$ to denote the fact that for every $i \in [n]$ such that $y_i \ne ?$, $c_i = y_i$. With this
definition, we are ready to define the notion of list decodability for erasures. Further, for an erased
received word, we will use WT$(y)$ to denote the number of erased positions.

Definition 2 (List decodable code for erasures). For $0 < \rho < 1$ and an integer $L \ge 1$, a code
$C \subseteq \Sigma^n$ is said to be $(\rho, L)$-list decodable for erasures if for every $y \in (\Sigma \cup \{?\})^n$ with WT$(y) \le \rho n$, the
number of codewords $c \in C$ such that $c \simeq y$ is at most $L$.
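
For very small codes, Definition 2 can be checked by brute force. The sketch below is illustrative only (the names `ERASED`, `consistent` and `max_erasure_list_size` are not from the paper) and takes the number of erasures as an integer rather than a fraction $\rho$. It uses the observation that erasing more positions can only enlarge the list, so it suffices to examine erasure patterns with exactly the given number of erasures, and that only received words consistent with at least one codeword matter.

```python
from collections import Counter
from itertools import combinations

ERASED = "?"


def consistent(c, y):
    """True if codeword c agrees with y on every non-erased position of y."""
    return all(yi == ERASED or ci == yi for ci, yi in zip(c, y))


def max_erasure_list_size(code, num_erasures):
    """Largest number of codewords consistent with some received word
    that has exactly `num_erasures` erased positions."""
    n = len(code[0])
    worst = 0
    for erased in combinations(range(n), num_erasures):
        erased_set = set(erased)
        kept = [i for i in range(n) if i not in erased_set]
        # Codewords with the same projection onto the kept positions
        # are indistinguishable after these erasures.
        groups = Counter(tuple(c[i] for i in kept) for c in code)
        worst = max(worst, max(groups.values()))
    return worst
```

A code is then list decodable from that many erasures with list size $L$ exactly when `max_erasure_list_size(code, num_erasures) <= L`.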

Reed-Solomon and Related Codes. The classical family of Reed-Solomon (RS) codes over a 
field F are defined to be the evaluations of low-degree polynomials at a sequence of distinct points 
of $\mathbb{F}$. Folded Reed-Solomon codes are obtained by viewing the RS code as a code over a larger
alphabet $\mathbb{F}^s$ by bundling together $s$ consecutive symbols for some folding parameter $s$. We will not
need any specifics of folded RS codes (in fact, even their definition) beyond certain properties that
we recall in Section [4].

3 Random Errors 

In this section we consider the random noise model mentioned in the introduction: the error 
locations are adversarial but the errors themselves are random. Our main result is the following. 

Theorem 1. Let $0 < \varepsilon, \delta < 1$ be reals and let $q$ and $n \ge \Omega(1/\varepsilon)$ be positive integers. Let $\Sigma =
\{0, 1, \dots, q - 1\}$.^5 Let $0 < \rho \le \delta - \varepsilon$ be a real. Let $C$ be a code over $\Sigma$ of block length $n$ and relative
distance $\delta$. Let $S \subseteq [n]$ with $|S| = (1 - \rho)n$. Then the following hold:

(a) If $q \ge 2^{\Omega(1/\varepsilon)}$, then for every codeword $c$ and all but a $q^{-\Omega(\varepsilon n)}$ fraction of error patterns
$e \in \Sigma^n$ with WT$(e) = \rho n$ and WT$_S(e) = 0$, the only codeword within the Hamming ball of
radius $\rho n$ around the received word $c + e$ is $c$.

^4 ? denotes an erasure.

^5 We will assume that $\Sigma$ is equipped with a monoid structure, i.e. for any $a, b \in \Sigma$, $a + b \in \Sigma$ and $0$ is the identity
element.
(b) Let $\gamma > 0$. If $q > \max\left(n, \left(\frac{e}{1-\delta+\varepsilon}\right)^{1/\gamma}\right)$ (where $e$ in the second term is the base of the natural logarithm), then for every codeword $c$ and all but a
$(q - 1)^{-((1-\gamma)\varepsilon/2 - (1-\delta)\gamma)n}$ fraction of error patterns $e \in \Sigma^n$ with WT$(e) = \rho n$ and WT$_S(e) = 0$, the
only codeword within the Hamming ball of radius $(\delta - \varepsilon)n$ around the received word $c + e$ is
$c$.

A weaker version of Theorem [1] was previously known for RS codes [13]. (Though the bounds
for part (b) are better in [14].) In particular, McEliece showed Theorem [1] for RS codes but over
all error patterns of Hamming weight $\rho n$. In other words, Theorem [1] implies the result in [14] if
we average our result over all subsets $S \subseteq [n]$ with $|S| = \rho n$.

Part (a) of Theorem [1] implies that for $e \le (\delta - \varepsilon)n$ random errors, with high probability, the
Hamming ball of radius e has one codeword in it. Note that this is twice the number of errors for 
which an analogous result can be shown for worst-case errors. Part (b) of Theorem [1] implies the 
following property of Reed-Solomon codes (where we pick $\varepsilon = 4R$ and $\gamma = 1/2$).

Corollary 2. Let k ^ n < q be integers such that q > (f Then the following property holds for 
Reed-Solomon codes of dimension $k$ and block length $n$ over $\mathbb{F}_q$. For at least a $1 - q^{-\Omega(k)}$ fraction of
error patterns $e$ of Hamming weight at most $n - 4k$ and any codeword $c$, the only codeword that
agrees in at least $4k$ positions with $c + e$ is $c$.

We would like to point out that in Corollary [2], the radius of the Hamming ball can be larger
than the number of errors. This can be used to slightly improve upon the best known algorithms
to decode RS codes from random errors beyond the Johnson bound for super-polynomially large $q$.
See Section [3.1] for more details.

A natural question is whether the lower bound of $q \ge 2^{\Omega(1/\varepsilon)}$ in part (a) of Theorem [1] can be
improved. In Section [3.2], we show that this is not possible.

Proof of Theorem [1]. Let $c \in C$ be the transmitted codeword. For an $\alpha \ge 1 - \delta + \varepsilon$, we call an
error pattern $e$ (with WT$(e) = \rho n$ and WT$_S(e) = 0$) $\alpha$-bad if there exists a codeword $c' \ne c \in C$ such
that $\Delta(c + e, c') = (1 - \alpha)n$ (and every other codeword has a larger Hamming distance from $c + e$).
We will show that the number of $\alpha$-bad error patterns (over all $\alpha \ge 1 - \delta + \varepsilon$) is an exponentially
small fraction of error patterns $e$ with WT$(e) = \rho n$ and WT$_S(e) = 0$, which will prove the theorem.

Fix $\alpha \ge 1 - \delta + \varepsilon$. Associate every $\alpha$-bad error pattern $e$ with the lexicographically first codeword
$c' \ne c \in C$ such that $\Delta(c + e, c') = (1 - \alpha)n$. Let $A \subseteq [n]$ be the set of positions where $c'$ and
$c + e$ agree. Further, define $S_0 = S \cap A$, $S_1 = A \cap ([n] \setminus S)$ and $\beta = |S_0|/n$. Thus, for every $\alpha$-bad
error pattern $e$, we can associate such a pair of subsets $(S_0, S_1) \subseteq S \times ([n] \setminus S)$. Hence, to count the
number of $\alpha$-bad error patterns it suffices to count for each possible pair $(S_0, S_1)$, with $|S_0| = \beta n$
and $|S_1| = (\alpha - \beta)n$ for some $\alpha - \rho \le \beta \le \alpha$, the number of $\alpha$-bad patterns that can be associated
with it. (The lower and upper bounds on $\beta$ follow from the fact that $S_1 \subseteq [n] \setminus S$ and $S_0 \subseteq A$,
respectively.)

Fix sets $S_0 \subseteq S$ and $S_1 \subseteq [n] \setminus S$ with $|S_0| = \beta n$ and $|S_1| = (\alpha - \beta)n$ for some $\alpha - \rho \le \beta \le \alpha$. To
upper bound the number of $\alpha$-bad error patterns that are associated with $(S_0, S_1)$, first note that
such error patterns take all the $(q - 1)^{(\rho - \alpha + \beta)n}$ possible values at the positions in $[n] \setminus (S \cup S_1)$. Fix a
vector $x$ of length $n - |S| - |S_1|$ and consider all the $\alpha$-bad error patterns $e$ such that $e_{[n] \setminus (S \cup S_1)} = x$.
Recall that each error pattern is associated with a codeword $c' \ne c$ such that $c'$ and $c + e$ agree
exactly in the positions $S_0 \cup S_1$. Further, such a codeword $c'$ is associated with exactly one $\alpha$-bad
error pattern $e$, where $e_{[n] \setminus (S \cup S_1)} = x$. (This is because fixing $c'$ fixes $e_{S_1}$ and $e_S$ is already fixed
by the definition of $S$.) Thus, to upper bound the number of $\alpha$-bad error patterns associated with
$(S_0, S_1)$, where $e_{[n] \setminus (S \cup S_1)} = x$ (call this number $N_{\alpha, S_0, S_1, x}$), we will upper bound the number of
such codewords $c'$. Note that as $C$ has distance $\delta n$, once any $(1 - \delta)n + 1$ positions are fixed,
there is at most one codeword that agrees with the fixed positions (if there is no such codeword
then the corresponding "error pattern" does not exist). Thus, there is at most one possible $c'$ once
we fix (say) the "first" $(1 - \delta)n + 1 - |S_0|$ values of $e_{S_1}$ (recall that $c'_{S_0} = c_{S_0}$). This implies that

$N_{\alpha, S_0, S_1, x} \le (q - 1)^{(1 - \delta)n + 1 - |S_0|} = (q - 1)^{(1 - \delta - \beta)n + 1}$.

Let $M_\alpha$ be the number of choices for $(S_0, S_1)$, which is just the number of choices for $A$. As the
number of choices for $x$ is $(q - 1)^{(\rho - \alpha + \beta)n}$, the number of $\alpha$-bad error patterns is at most

$M_\alpha \cdot (q - 1)^{(\rho - \alpha + \beta)n} \cdot (q - 1)^{(1 - \delta - \beta)n + 1} = M_\alpha \cdot (q - 1)^{(1 - \delta - \alpha)n + 1} \cdot (q - 1)^{\rho n}$.   (1)



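Proof of part (a). Note that the number of $\alpha$-bad patterns for any $\alpha \ge 1 - \delta + \varepsilon$ is upper bounded
by

$M_\alpha \cdot (q - 1)^{-\varepsilon n + 1} \cdot (q - 1)^{\rho n}$.

We trivially upper bound $M_\alpha$ by $2^n$. Recalling that there are $(q - 1)^{\rho n}$ error patterns $e$ with
WT$(e) = \rho n$ and WT$_S(e) = 0$ and that $\alpha$ can take at most $n$ values, the fraction of $\alpha$-bad patterns
(over all $\alpha \ge 1 - \rho \ge 1 - \delta + \varepsilon$) is at most

$n 2^n (q - 1)^{-\varepsilon n + 1} \le (q - 1)^{\left(-\varepsilon + \frac{2}{\log_2 (q-1)} + \frac{1}{n}\right)n} \le (q - 1)^{-\varepsilon n/3} \le q^{-\varepsilon n/6}$,

where the first inequality follows from the fact that $n \le 2^n$, the second inequality is true for $n \ge 3/\varepsilon$
and $q = 2^{\Omega(1/\varepsilon)}$ large enough (it suffices that $\log_2(q - 1) \ge 6/\varepsilon$), and the last inequality follows from the inequality $q - 1 \ge \sqrt{q}$ (which in turn is true
for $q \ge 3$).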

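Proof of part (b). Note that $M_\alpha = \binom{n}{\alpha n} \le (e/\alpha)^{\alpha n}$, where $e$ here is the base of the natural logarithm. Thus, the number of $\alpha$-bad error patterns
is upper bounded by

$(q - 1)^{(1 - \delta - \alpha + \alpha \log_{q-1}(e/\alpha))n + 1} \cdot (q - 1)^{\rho n} \le (q - 1)^{(1 - \delta - \alpha(1 - \gamma))n + 1} \cdot (q - 1)^{\rho n} \le (q - 1)^{(-(1 - \gamma)\varepsilon + \gamma(1 - \delta))n + 1} \cdot (q - 1)^{\rho n}$,

where the inequalities follow from the facts that $q > \left(\frac{e}{1 - \delta + \varepsilon}\right)^{1/\gamma}$ and $\alpha \ge 1 - \delta + \varepsilon$. Recalling that
there are $(q - 1)^{\rho n}$ error patterns $e$ with WT$(e) = \rho n$ and WT$_S(e) = 0$ and that $\alpha$ can take at most
$n$ values, the fraction of $\alpha$-bad patterns (over all $\alpha \ge 1 - \delta + \varepsilon$) is at most

$n (q - 1)^{(-(1 - \gamma)\varepsilon + (1 - \delta)\gamma)n + 1} \le (q - 1)^{\left(-(1 - \gamma)\varepsilon + \gamma(1 - \delta) + \frac{2}{n}\right)n} \le (q - 1)^{-\left(\frac{(1 - \gamma)\varepsilon}{2} - \gamma(1 - \delta)\right)n}$,

where the first inequality follows from the fact that $q > n$ and the second inequality is true for
$n \ge 4/((1 - \gamma)\varepsilon)$. ■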
3.1 An Implication of Corollary [2]

To the best of our knowledge, for $e > n - \sqrt{kn}$, the only known algorithms to decode Reed-Solomon
(RS) codes from $e$ random errors are the trivial ones: (i) Go through all possible codewords and
output the closest codeword; this takes $2^{O(k \log q)} \cdot n$ time, and (ii) Go through all possible $\binom{n}{e}$ error
locations and check that the received word outside the purported error locations is indeed an RS
codeword; this takes $2^{O((n - e) \log(n/(n - e)))} \cdot O(n^2)$ time.

If $e \le n - 4k$, then by Corollary [2], we can go through all the $\binom{n}{4k}$ choices of subsets of size
$4k$ and check if the received word projected down to the subset lies in the corresponding projected
down RS code. This algorithm takes $2^{O(k \log(n/k))} \cdot O(n^2)$ time, which is better than the trivial
algorithm (ii) mentioned above for $e$ in $n - \omega(k)$. Further, this algorithm is better than the trivial
algorithm (i) when $q$ is super-polynomially large in $n$.
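
The subset-enumeration idea can be sketched as follows, under assumptions that go beyond the text: the field is a prime field $\mathbb{F}_p$ (so arithmetic is modulo a prime `p`), the check on each subset is done by Lagrange interpolation through its first $k$ points, and the agreement size `t` is a parameter (the text uses $t = 4k$). The function names are illustrative and Python 3.8 or later is assumed for the modular inverse `pow(den, -1, p)`.

```python
from itertools import combinations


def poly_eval(coeffs, x, p):
    """Evaluate a polynomial (coefficients listed low degree first) at x mod p."""
    acc = 0
    for a in reversed(coeffs):
        acc = (acc * x + a) % p
    return acc


def rs_encode(message, points, p):
    """RS codeword: evaluations of the message polynomial at the given points."""
    return [poly_eval(message, x, p) for x in points]


def interpolate_value(xs, ys, x, p):
    """Value at x of the unique polynomial of degree < len(xs) through (xs, ys)."""
    total = 0
    for i, (xi, yi) in enumerate(zip(xs, ys)):
        num, den = 1, 1
        for j, xj in enumerate(xs):
            if j != i:
                num = num * (x - xj) % p
                den = den * (xi - xj) % p
        total = (total + yi * num * pow(den, -1, p)) % p
    return total


def decode_by_agreement(received, points, k, t, p):
    """Return a codeword agreeing with `received` on some t positions, or None.

    Tries every size-t subset of positions; a subset passes if the received
    values on it are consistent with a single polynomial of degree < k.
    """
    for subset in combinations(range(len(points)), t):
        base = subset[:k]
        xs = [points[i] for i in base]
        ys = [received[i] for i in base]
        if all(interpolate_value(xs, ys, points[i], p) == received[i]
               for i in subset[k:]):
            return [interpolate_value(xs, ys, x, p) for x in points]
    return None
```

The sketch returns the first codeword it finds; it is Corollary [2] that indicates that, for most error patterns of weight at most $n - 4k$, the only codeword with agreement at least $4k$ is the transmitted one. The running time is dominated by the number of subsets, so it remains exponential, as noted above.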

3.2 On the Alphabet Size in Theorem [1] 

It is well-known that any code that is $(\rho, L)$-list decodable and that also has rate at least $1 - H_q(\rho) + \varepsilon$
needs to satisfy $L = q^{\Omega(\varepsilon n)}$ (cf. [3]). A natural way to try to show that part (a) of Theorem [1] is
false for $q \le 2^{o(1/\varepsilon)}$ is to look at codes whose relative distance is strictly larger than $1 - H_q(\rho)$.
Algebraic-geometric (AG) codes are a natural candidate since they can beat the Gilbert-Varshamov
bound for an alphabet size of at least 49 (cf. [10]). The only catch is that the lower bound on $L$
follows from an average case argument and we need to show that over most error patterns, the list
size is more than one. For this we need an "Inverse Markov argument," like the one in [1].
(The argument above was suggested to us by Venkat Guruswami.)

We begin with the more general statement of the "Inverse Markov argument" from [1]. (We 
thank Madhu Sudan for the statement and its proof.) 

Lemma 3. Let $G = (L, R, E)$ be a bipartite graph with $|L| = n_L$ and $|R| = n_R$. Let the average left
degree of $G$ be denoted by $d_L$. Note that the average right degree is $d_R = \frac{n_L d_L}{n_R}$. Then the following
statements are true:

(i) If we pick an edge $e = (u, v)$ uniformly at random from $E$, then the probability that $d(v)$^6 $\le \varepsilon d_R$
is at most $\varepsilon$.

(ii) If $G$ is $d$-left regular then consider the following process: Uniformly at random pick a vertex
$u \in L$. Then uniformly at random pick a vertex $v \in R$ in $u$'s neighborhood. Then the
probability that $d(v) \le \varepsilon d_R = \frac{\varepsilon d n_L}{n_R}$ is at most $\varepsilon$.

Proof. We first note that (ii) follows from (i), as the random process in (ii) ends up picking edges uniformly at random from $E$.

To conclude, we prove part (i). Consider the set $R' \subseteq R$ of vertices $v$ that satisfy $d(v) \le \varepsilon d_R$. Note that the number of edges that have an end-point in $R'$ is at most $\varepsilon d_R \cdot n_R = \varepsilon |E|$. Thus, the probability that a uniformly random edge in $E$ has an end-point in $R'$ is upper bounded by $\varepsilon |E| / |E| = \varepsilon$, as desired. ■
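The counting step in the proof of part (i) can be checked directly on a small graph. The sketch below is illustrative only; the function name `low_degree_edge_fraction` and the example graph are assumptions made for this example. It computes the exact fraction of edges whose right end-point has degree at most $\varepsilon d_R$, which the lemma bounds by $\varepsilon$.

```python
from collections import Counter
from fractions import Fraction

def low_degree_edge_fraction(edges, n_right, eps):
    """Fraction of edges (u, v) whose right end-point v has degree <= eps * d_R."""
    degree = Counter(v for _, v in edges)
    d_R = Fraction(len(edges), n_right)  # average right degree |E| / n_R
    bad = sum(1 for _, v in edges if degree[v] <= eps * d_R)
    return Fraction(bad, len(edges))

# Right vertices 0, 1, 2 have degrees 4, 3, 1; right vertex 3 is isolated.
edges = [(0, 0), (1, 0), (2, 0), (3, 0), (0, 1), (1, 1), (2, 1), (0, 2)]
eps = Fraction(1, 2)
frac = low_degree_edge_fraction(edges, n_right=4, eps=eps)
```

Here $|E| = 8$ and $d_R = 2$, so only the single edge into the degree-1 vertex is counted, giving a fraction of $1/8 \le \varepsilon$.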

The following is an easy consequence of Lemma 3 and the standard probabilistic method used to prove the lower bound for list decoding capacity.

Footnote (to Lemma 3): For any vertex $v$, we denote its degree by $d(v)$.

Lemma 4. Let $q \ge 2$ and $0 \le \rho < 1 - 1/q$. Then the following holds for large enough $n$. Let $C \subseteq \{0, \dots, q-1\}^n$ be a code with rate $1 - H_q(\rho) + \gamma$. Then there exists a codeword $c \in C$ such that for at least a $1 - q^{-\Omega(\gamma n)}$ fraction of error patterns $e$ of Hamming weight at most $\rho n$, it is true that the Hamming ball of radius $\rho n$ around $c + e$ has at least two codewords from $C$ in it.

Proof. Define the bipartite graph $G_{C,\rho} = (C, \{0, \dots, q-1\}^n, E)$ as follows. For every $c \in C$, add $(c, y)$ to $E$ for every $y$ such that $\Delta(c, y) \le \rho n$. Note that $G_{C,\rho}$ is a $\mathrm{Vol}_q(\rho n)$-left regular bipartite graph, where $\mathrm{Vol}_q(r)$ is the volume of the $q$-ary Hamming ball with radius $r$. Note that the graph has an average right degree of

$$d_R = \frac{|C| \cdot \mathrm{Vol}_q(\rho n)}{q^n} \ge q^{\gamma n - o(n)},$$

where in the above we have used the following well known inequality (cf. [13]):

$$\mathrm{Vol}_q(\rho n) \ge q^{H_q(\rho) n - o(n)}.$$

Thus, by part (ii) of Lemma 3 (with $\varepsilon = (d_R)^{-1} \le q^{-\gamma n + o(n)}$), we have

$$\Pr_{c \in C} \; \Pr_{\substack{e \in \{0, \dots, q-1\}^n \\ \mathrm{wt}(e) \le \rho n}} \left[\, c + e \text{ has at most one codeword within Hamming distance } \rho n \,\right] \le q^{-\gamma n + o(n)}.$$

Thus, there must exist at least one codeword $c \in C$ with the required property. ■

Thus, given Lemma 4, we can prove that part (a) of Theorem 1 is not true for a certain value of $q$ if there exists a code $C \subseteq \{0, \dots, q-1\}^n$ with relative distance $\delta$ such that it has rate at least $1 - H_q(\delta - \varepsilon) + \gamma$ for some $\gamma > 0$. Now it is known that for fixed $\alpha > 0$, $H_q(\alpha) \ge \alpha + \Omega(1/\log q)$ (cf. [20, Lecture 7]). Thus, we would be done if we could find a code with relative distance $\delta$ and rate at least

$$1 - \delta + \varepsilon + \gamma - \Omega(1/\log q).$$

For $q \le 2^{o(1/\varepsilon)}$, the bound above for small enough $\varepsilon$ is upper bounded by $1 - \delta - \varepsilon - \frac{1}{\sqrt{q} - 1}$ (assuming that $\gamma = O(\varepsilon)$). It is known that AG codes over alphabets of size $q \ge 49$ with relative distance $\delta$ exist that achieve a rate of $1 - \delta - \frac{1}{\sqrt{q} - 1}$. Thus, for $49 \le q \le 2^{o(1/\varepsilon)}$, AG codes over alphabets of size $q$ are the required codes.
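The entropy estimate used in this argument can be checked numerically. The sketch below is an illustration only: it uses the standard definition $H_q(\alpha) = \alpha \log_q(q-1) - \alpha \log_q \alpha - (1-\alpha)\log_q(1-\alpha)$, and the helper names `q_ary_entropy` and `scaled_gap` are hypothetical. It evaluates $(H_q(\alpha) - \alpha) \cdot \log_2 q$, which should stay bounded away from zero for fixed $\alpha$ if $H_q(\alpha) - \alpha$ is of order $1/\log q$.

```python
from math import log

def q_ary_entropy(alpha, q):
    """q-ary entropy H_q(alpha) for 0 < alpha < 1."""
    return (alpha * log(q - 1, q)
            - alpha * log(alpha, q)
            - (1 - alpha) * log(1 - alpha, q))

def scaled_gap(alpha, q):
    """(H_q(alpha) - alpha) * log2(q); approaches the binary entropy of alpha as q grows."""
    return (q_ary_entropy(alpha, q) - alpha) * log(q, 2)

gaps = {q: scaled_gap(0.3, q) for q in (49, 2**10, 2**20)}
```

For $\alpha = 0.3$ the scaled gap stays close to the binary entropy of $0.3$ (about $0.88$) across these alphabet sizes, consistent with a gap of order $1/\log q$.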



4 Concatenated Codes 

This section first shows that with folded Reed-Solomon codes and independently chosen small 
random linear inner codes, the resulting concatenated code can achieve erasure capacity in a list 
decoding setting. A similar result holds when the outer code is a random linear code, and this 
result is presented second. 



4.1 Folded Reed-Solomon Outer Code 

Theorem 5. Let $q$ be a prime power and let $0 < R \le 1$ be an arbitrary rational number. Let $n, K, N \ge 1$ be large enough integers such that $K = RN$. Let $C_{out}$ be a folded Reed-Solomon code over $\mathbb{F}_{q^n}$ of block length $N$ and rate $R$. Let $C_{in}^1, \dots, C_{in}^N$ be random linear codes over $\mathbb{F}_q$, where $C_{in}^i$ is generated by a random $n \times n$ matrix $G_i$ over $\mathbb{F}_q$ and the random choices for $G_1, \dots, G_N$ are all independent. Then the concatenated code $C^* = C_{out} \circ (C_{in}^1, \dots, C_{in}^N)$ is a $\left(1 - R - \varepsilon,\, (N/\varepsilon^2)^{O(\varepsilon^{-2} \log(1/R))}\right)_{led}$-list decodable code with probability at least $1 - q^{-\Omega(nN)}$ over the choices of $G_1, \dots, G_N$. Further, $C^*$ has rate $R$ w.h.p.

To set up the proof of the theorem above, we begin by collecting certain definitions and results from [5]. The following notion of independence will be crucial.

Definition 3 (Independent tuples). Let $C$ be a code of block length $N$ and rate $R$ defined over $\mathbb{F}_{q^k}$. Let $J \ge 1$ and $0 \le d_1, \dots, d_J \le N$ be integers. Let $\mathbf{d} = (d_1, \dots, d_J)$. An ordered tuple of codewords $(c^1, \dots, c^J)$, $c^j \in C$, is said to be $(\mathbf{d}, \mathbb{F}_q)$-independent if the following holds: $d_1 = \mathrm{wt}(c^1)$ and for every $1 < j \le J$, $d_j$ is the number of positions $i$ such that $c_i^j$ is $\mathbb{F}_q$-independent of the vectors $\{c_i^1, \dots, c_i^{j-1}\}$, where $c^\ell = (c_1^\ell, \dots, c_N^\ell)$.
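To make the definition concrete, the following sketch computes the vector $\mathbf{d}$ for a small tuple of codewords. It is illustrative only and rests on assumptions not in the text: it takes $q = 2$, represents each symbol of $\mathbb{F}_{2^k}$ as a $k$-bit integer so that $\mathbb{F}_2$-linear combinations are XORs, and uses the hypothetical helper names `in_span_gf2` and `independence_vector`.

```python
def in_span_gf2(vectors, target):
    """True if `target` is an F_2-linear combination (XOR) of `vectors` (k-bit ints)."""
    basis = []
    for v in vectors:
        for b in basis:
            v = min(v, v ^ b)  # clears the leading bit of b from v if it is set
        if v:
            basis.append(v)
    for b in basis:
        target = min(target, target ^ b)
    return target == 0

def independence_vector(codewords):
    """Return d = (d_1, ..., d_J) for an ordered tuple of equal-length codewords."""
    d = []
    for j, c in enumerate(codewords):
        earlier = codewords[:j]
        # Position i counts if c[i] is not in the F_2-span of the earlier symbols
        # at position i; for j = 0 the span is {0}, so this counts nonzero symbols.
        d.append(sum(1 for i, symbol in enumerate(c)
                     if not in_span_gf2([prev[i] for prev in earlier], symbol)))
    return d

# Symbols of F_4 written as 2-bit integers 0..3, block length N = 4.
d = independence_vector([[1, 2, 0, 3], [1, 3, 2, 0], [0, 1, 2, 3]])
```

For this tuple the first entry is the Hamming weight of the first codeword, and every symbol of the third codeword lies in the span of the earlier symbols at its position, so $\mathbf{d} = (3, 2, 0)$.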

Note that for any tuple of codewords $(c^1, \dots, c^J)$ there exists a unique $\mathbf{d}$ such that it is $(\mathbf{d}, \mathbb{F}_q)$-independent. The next two results will be crucial in the proof of our second main result.

Lemma 6 ([5]). Let $\varepsilon > 0$ and let $C$ be a folded Reed-Solomon code of block length $N$ and rate $0 < R < 1$ that is defined over $\mathbb{F}_Q$, where $Q = q^k$. For any $L$-tuple of codewords from $C$, where $L \ge J \cdot (N/\varepsilon^2)^{O(\varepsilon^{-1} J \log(q/R))}$, there exists a sub-tuple of $J$ codewords such that the $J$-tuple is $(\mathbf{d}, \mathbb{F}_q)$-independent, where $\mathbf{d} = (d_1, \dots, d_J)$ with $d_j \ge (1 - R - \varepsilon)N$ for every $1 \le j \le J$.

Lemma 7 ([5]). Let $C$ be a folded Reed-Solomon code of block length $N$ and rate $0 < R < 1$ that is defined over $\mathbb{F}_Q$, where $Q = q^k$. Let $J \ge 1$ and $0 \le d_1, \dots, d_J \le N$ be integers and define $\mathbf{d} = (d_1, \dots, d_J)$. Then the number of $(\mathbf{d}, \mathbb{F}_q)$-independent tuples in $C$ is at most

$$q^{NJ(J+1)} \prod_{j=1}^{J} Q^{\max(d_j - N(1-R) + 1,\, 0)}.$$

Given the outer code $C_{out}$ and the inner codes $C_{in}^i$, recall that for every codeword $u = (u_1, \dots, u_N) \in C_{out}$, the codeword $uG = (u_1 G_1, u_2 G_2, \dots, u_N G_N)$ is in $C^* = C_{out} \circ (C_{in}^1, \dots, C_{in}^N)$, where the operations are over $\mathbb{F}_q$.
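The encoding map $u \mapsto uG$ can be sketched as follows. This is a minimal illustration under assumptions not in the text: $q$ is taken to be prime so that arithmetic modulo $q$ realizes $\mathbb{F}_q$, each outer symbol in $\mathbb{F}_{q^n}$ is represented as a length-$n$ vector over $\mathbb{F}_q$, and the names `random_inner_matrices`, `vec_mat` and `concat_encode` are hypothetical.

```python
import random

def random_inner_matrices(N, n, q, seed=0):
    """N independent uniformly random n x n matrices over F_q (q prime); rank is not enforced."""
    rng = random.Random(seed)
    return [[[rng.randrange(q) for _ in range(n)] for _ in range(n)] for _ in range(N)]

def vec_mat(u, G, q):
    """Row vector u times matrix G over F_q."""
    return [sum(u[r] * G[r][c] for r in range(len(u))) % q for c in range(len(G[0]))]

def concat_encode(outer_codeword, inner_matrices, q):
    """Map u = (u_1, ..., u_N) to uG = (u_1 G_1, ..., u_N G_N), one block per outer symbol."""
    return [vec_mat(u_i, G_i, q) for u_i, G_i in zip(outer_codeword, inner_matrices)]

# Two outer symbols over F_{3^2}, with the identity and one fixed matrix as inner generators.
u = [[1, 2], [0, 1]]
G = [[[1, 0], [0, 1]], [[1, 2], [2, 2]]]
encoded = concat_encode(u, G, 3)
```

Each outer symbol is encoded by its own matrix, which is what makes the blocks of a concatenated codeword independent when the $G_i$ are chosen independently.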

We now begin with the proof. The fact that C* has rate R w.h.p. follows the argument used 
in [5] and is omitted. 

Define $Q = q^n$. Let $L$ be the worst-case list size that we are aiming for (we will fix its value at the end). By Lemma 6, any $(L+1)$-tuple of $C_{out}$ codewords $(u^0, \dots, u^L) \in (C_{out})^{L+1}$ contains at least $J = (L+1)/(N/\gamma^2)^{O(\gamma^{-1} J \log(q/R))}$ codewords that form a $(\mathbf{d}, \mathbb{F}_q)$-independent tuple, for some $\mathbf{d} = (d_1, \dots, d_J)$, with $d_j \ge (1 - R - \gamma)N$ for all $1 \le j \le J$ (we will specify $\gamma$, $0 < \gamma < 1 - R$, later). Thus, to prove the theorem it suffices to show that with high probability, there is no received word $y \in (\mathbb{F}_q \cup \{?\})^{nN}$ with $\mathrm{wt}(y) \le (1 - R - \varepsilon)nN$ and $J$-tuple of codewords $(u^1 G, \dots, u^J G)$, where $(u^1, \dots, u^J)$ is a $J$-tuple of folded Reed-Solomon codewords that is $(\mathbf{d}, \mathbb{F}_q)$-independent, such that $u^i G \simeq y$ for every $1 \le i \le J$. For the rest of the proof, we will call a $J$-tuple of $C_{out}$ codewords $(u^1, \dots, u^J)$ a good tuple if it is $(\mathbf{d}, \mathbb{F}_q)$-independent for some $\mathbf{d} = (d_1, \dots, d_J)$, where $d_j \ge (1 - R - \gamma)N$ for every $1 \le j \le J$.



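Footnote (to Theorem 5): We stress that we do not require that the $G_i$'s have rank $n$.

Define $\rho = 1 - R - \varepsilon$. Note that by the union bound, we need to show that

$$\sum_{\substack{y \in (\mathbb{F}_q \cup \{?\})^{nN} \\ \mathrm{wt}(y) \le \rho nN}} \;\; \sum_{\text{good } (u^1, \dots, u^J) \in (C_{out})^J} \Pr_{G}\left[\bigwedge_{i=1}^{J} u^i G \simeq y\right] \le q^{-\Omega(nN)}. \qquad (2)$$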

For now fix a good tuple $(u^1, \dots, u^J)$ that is $(\mathbf{d} = (d_1, \dots, d_J), \mathbb{F}_q)$-independent. Define sets $S_i \subseteq [N]$ ($|S_i| = d_i$) to be the positions that are "witnesses" to the fact that $(u^1, \dots, u^J)$ is $(\mathbf{d}, \mathbb{F}_q)$-independent.

Then the probability that a particular codeword matches the unerased positions of the received word is:

$$\Pr[u^i G \simeq y] \le \Pr\left[(u^i G)_{S_i} \simeq y_{S_i}\right]. \qquad (3)$$



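Further, the latter probability in inequality (3) is independent of the corresponding probability for any $j \ne i$. To see this, let $E_i$ be the event that $(u^i G)_{S_i} \simeq y_{S_i}$. Then note that:

$$\Pr\left[\bigwedge_{i=1}^{J} E_i\right] = \Pr\left[\bigwedge_{i=2}^{J} E_i \,\Big|\, E_1\right] \cdot \Pr[E_1].$$

As $(u^1, \dots, u^J)$ is a good tuple, this is simply:

$$\Pr\left[\bigwedge_{i=2}^{J} E_i\right] \cdot \Pr[E_1].$$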



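Using induction, we get that the probability that all messages in the list match is just the product of the individual probabilities. Thus, we have:

$$\Pr\left[\bigwedge_{i=1}^{J} u^i G \simeq y\right] \le \Pr\left[\bigwedge_{j=1}^{J} E_j\right] = \prod_{i=1}^{J} \Pr\left[(u^i G)_{S_i} \simeq y_{S_i}\right].$$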



If we let $u_i$ be the number of unerased $q$-ary symbols in $y_{S_i}$, then since all the $G_i$ are independent random matrices:

$$\Pr\left[(u^i G)_{S_i} \simeq y_{S_i}\right] = q^{-u_i} \le q^{-d_i n + \rho nN}.$$

Note that the reason that $(-d_i n + \rho nN) \ge -u_i$ is because in the worst case, all erasures occur in $S_i$.

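We take a union bound over the number of different ways that the $d_i$ can occur:

$$\sum_{\text{good } (u^1, \dots, u^J)} \Pr\left[\bigwedge_{i=1}^{J} u^i G \simeq y\right] \le \sum_{(1-R-\gamma)N \le d_1, d_2, \dots, d_J \le N} \left( q^{NJ(J+1)} \prod_{i=1}^{J} Q^{\max(0,\, d_i - N(1-R))} \right) \prod_{i=1}^{J} q^{-d_i n + \rho nN}. \qquad (4)$$

The bound in parentheses in inequality (4) comes from Lemma 7.

Now since

$$\max(0,\, d_i - (1-R)N) \le d_i - (1-R-\gamma)N,$$

we can rewrite this, collapsing the two products into one, as:

$$\le \sum_{(1-R-\gamma)N \le d_1, \dots, d_J \le N} q^{NJ(J+1)} \prod_{i=1}^{J} q^{\,n(d_i - (1-R-\gamma)N) - d_i n + \rho nN}. \qquad (5)$$

But since:

$$n d_i - n d_i = 0,$$

we can rewrite this again, replacing the sum with an upper bound, as

$$\le N^J \cdot q^{NJ(J+1)} \cdot q^{JnN(\rho - 1 + R + \gamma)}.$$

Note that:

$$NJ(J+1) = JnN \cdot \frac{J+1}{n}.$$

So for $n \ge (J+1)/\gamma$:

$$q^{NJ(J+1)} \le q^{JnN\gamma}.$$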

Note also that the total number of possible received words can be bounded as follows:

$$\binom{nN}{\rho nN} \cdot q^{(1-\rho)nN} \le q^{2nN}, \qquad (6)$$

where the first term in the product on the left-hand side of inequality (6) is the number of ways to choose erasure locations, and the second term is the number of ways to choose symbols in the unerased positions.

Also,

$$N^J \le q^{JnN\gamma}$$

for large enough $N$.

After applying these bounds, we get that:

$$\Pr[C^* \text{ is not } (\rho, L)_{led}] \le q^{2nN} \cdot q^{JnN(\rho - 1 + R + 3\gamma)}. \qquad (7)$$

Recall that we have $R = 1 - \rho - \varepsilon$ and can choose $J$ and $\gamma$ freely. Setting

$$J \ge 1/\gamma$$

will make

$$nN \le JnN\gamma$$

and in particular,

$$q^{2nN} \le q^{JnN(2\gamma)}.$$

If we pick $\gamma = \varepsilon/10$, then our final error probability in inequality (7) will be:

$$\Pr[C^* \text{ is not } (\rho, L)_{led}] \le q^{JnN(5\gamma - \varepsilon)} = q^{-\varepsilon JnN/2},$$

establishing the desired error bound.
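The parameter bookkeeping at the end of the proof can be verified with exact rational arithmetic. The sketch below is only a sanity check under the reading that the exponent of $q$ in (7) is $2nN + JnN(\rho - 1 + R + 3\gamma)$ with $\rho - 1 + R = -\varepsilon$; the function name `exponent_per_nN` is hypothetical.

```python
from fractions import Fraction

def exponent_per_nN(eps, gamma, J):
    """Exponent of q in inequality (7), divided by nN, using rho - 1 + R = -eps."""
    return 2 + J * (-eps + 3 * gamma)

eps = Fraction(1, 5)
gamma = eps / 10           # the choice gamma = eps / 10
J = 1 / gamma              # the boundary case J = 1 / gamma
value = exponent_per_nN(eps, gamma, J)
target = -eps * J / 2      # claimed final exponent -eps * J / 2 (per nN)
```

With $J = 1/\gamma$ the two quantities coincide (both equal $-5$ here), and a larger $J$ only makes the exponent more negative than $-5$, although it then no longer equals $-\varepsilon J/2$ term for term.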

Remark 1. It is easy to see that the rate of the inner codes has to be very close to 1. To see this, consider the erasure pattern where a $\rho$ fraction of the outer codeword symbols are completely erased. To recover from such a situation, we need $R$ to be close to $1 - \rho$. One could revisit the proof above for a general inner rate $r$ and try to figure out how far away from 1 the rate $r$ can be. If we had $r < 1$, then in (5) the exponent within the product should read $rn(d_i - (1 - R - \gamma)N) - d_i n + \rho nN$. We ultimately need $R^* = Rr = 1 - \rho - \varepsilon$. Using this and some manipulations, the exponent becomes $nN\left[(1 - r)(1 - d_i/N) - \varepsilon + r\gamma\right]$. The only thing that we can guarantee about $d_i$ is that $d_i \ge (1 - R - \gamma)N$. If we desire the ultimate error probability to be $q^{-\Omega(\varepsilon nNJ)}$, then the proof goes through only if $rR \ge R - O(\varepsilon)$.

4.2 Random Linear Outer Code 

Theorem 8. Let $q$ be a prime power and let $0 < R \le 1$ be an arbitrary rational. Let $n, K, N \ge 1$ be large enough integers such that $K = RN$. Let $C_{out}$ be a random linear code over $\mathbb{F}_{q^n}$ that is generated by a random $K \times N$ matrix over $\mathbb{F}_{q^n}$. Let $C_{in}^1, \dots, C_{in}^N$ be random linear codes over $\mathbb{F}_q$, where $C_{in}^i$ is generated by a random $n \times n$ matrix $G_i$ and the random choices for $C_{out}, G_1, \dots, G_N$ are all independent. Then the concatenated code $C^* = C_{out} \circ (C_{in}^1, \dots, C_{in}^N)$ is a $\left(1 - R - \varepsilon,\, q^{O(n/\varepsilon)}\right)_{led}$-list decodable code with probability at least $1 - q^{-\Omega(nN)}$ over the choices of $C_{out}, G_1, \dots, G_N$. Further, with high probability, $C^*$ has rate $R$.

Proof. Let $q \ge 2$ and let $R^* = R$ be the rate of the outer code (the inner codes are chosen so that their dimension $k = n$, and therefore have rate 1).

We define a segment of a codeword in $C^*$ as a sequence of consecutive $q$-ary symbols generated by one particular inner code. An assumption that we will make for ease of analysis (and which we will remove later) is that erasures, which occur with relative rate $\rho$, are equally distributed among the concatenated codeword segments. This means that in our received word $y$, the result of each of the $N$ inner code encodings contains at most $\rho n$ erasures.

We will show that there exists some integer $L$ such that any subset of $L + 1$ distinct encoded messages has the property that they all match the non-erased positions of the received word with low probability. Then we will apply the union bound to show that with high probability, the code meets the list decoding capacity for erasures.

Define $Q = q^n$ and $\rho = 1 - R^* - \varepsilon$. Let $J = \lfloor \log_Q(L+1) \rfloor$. Then there exists a subset of size at least $J$ of our list (which is of size $L + 1$) such that the set of messages $\{m_1, m_2, \dots, m_J\}$ is linearly independent over $\mathbb{F}_Q$. This is because any $J - 1$ messages span at most $Q^{J-1} < L + 1$ vectors over $\mathbb{F}_Q$.

Because of this fact and because $C_{out}$ is a random linear code, the set $\{C_{out}(m_1), C_{out}(m_2), \dots, C_{out}(m_J)\}$ can be treated as a set of independently chosen random vectors in $\mathbb{F}_Q^N$.

Fix an $s$ so that $1 \le s \le N$ and let $y_s$ represent a particular segment of our received word. (There are $N$ such segments over $\mathbb{F}_q$.) In our list of $J$ outer encoded messages, we denote by $i$ the number of outer encoded messages $C_{out}(m_t)$ that, restricted to the segment $s$, are the zero vector. Then $J - i$ is the number of messages such that $C_{out}(m_t)$ is not the zero vector when restricted to the segment $s$.

We can bound the probability that each of these messages matches the received word at this segment, in the unerased positions, as follows:

$$\Pr[(C^*(m_t))_s \simeq y_s] \le \left(\frac{1}{Q}\right)^{i} \left(1 - \frac{1}{Q}\right)^{J-i} \cdot q^{-(1-\rho)n(J-i)}. \qquad (8)$$

In the above, the relationship $(C^*(m_t))_s \simeq y_s$ means that the concatenated code, on message $m_t$, restricted to segment $s$, matches the received word $y$ on segment $s$ at all unerased positions.

If $(C_{out}(m_t))_s = 0$, then we just assume that $(C^*(m_t))_s \simeq y_s$, so this is an upper bound, and not an equality.

The first term in the RHS of (8) is the probability that $i$ messages at this segment map to the zero vector, and the second term is the probability that $J - i$ messages map to something other than the zero vector.

The third term is the probability that those nonzero $J - i$ messages match the received word in every unerased position.
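Now since

$$\left(\frac{1}{Q}\right)^{i} = q^{-in} \quad \text{and} \quad \left(1 - \frac{1}{Q}\right)^{J-i} \le 1,$$

we have that

$$\Pr[(C^*(m_t))_s \simeq y_s] \le q^{-(1-\rho)nJ} \cdot q^{\,i(1-\rho)n - in}.$$

Also, because $(1 - \rho)$ is at most 1,

$$q^{\,i(1-\rho)n - in} \le 1.$$

Therefore

$$\Pr[(C^*(m_t))_s \simeq y_s] \le q^{-(1-\rho)nJ}.$$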



The probability, then, that every message in the list matches the received word in the unerased positions for a single segment, taken over all possible choices of locations and sizes of $i$, is (by the union bound over such locations and sizes, noting that there are at most $2^J$ ways to make these choices):

$$\Pr\left[\bigwedge_{t=1}^{J} (C^*(m_t))_s \simeq y_s\right] \le 2^J \cdot q^{-(1-\rho)nJ}. \qquad (9)$$



Recalling that each inner code is chosen independently, the probability that this is true for all segments is then

$$\Pr\left[\bigwedge_{t=1}^{J} C^*(m_t) \simeq y\right] \le 2^{JN} \cdot q^{-(1-\rho)nJN}.$$



Taking the union bound over all possible received words and lists of size $J$:

$$\Pr[C^* \text{ is not } (\rho, L)_{led}] \le 2^{nN} \cdot q^{(1-\rho)nN} \cdot q^{kKJ} \cdot q^{JN} \cdot q^{-(1-\rho)nJN}. \qquad (10)$$

The first term in the RHS of (10) is an upper bound on the number of possibilities for the erasure positions. The second term is the number of ways to specify the symbols in the unerased positions, the third term is the number of possible lists of size $J$, and the fourth and fifth terms come from the previous inequality (using $2 \le q$ for the fourth).

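Since $kK = R^* nN$, and $2 > 1 + (1 - \rho)$, this can be rewritten and simplified as:

$$\Pr[C^* \text{ is not } (\rho, L)_{led}] \le q^{-nNJ\left(-\frac{2}{J} - R^* - \frac{1}{n} + (1 - \rho)\right)}.$$

If we can choose $n$, $R^*$, and $J$ appropriately so that:

$$-\frac{2}{J} - R^* - \frac{1}{n} + (1 - \rho) > \varepsilon/2,$$

then this probability will be exponentially small. Since $(1 - \rho) - R^* = \varepsilon$, setting $n \ge J$ and $J = \lceil c/\varepsilon \rceil$ for a large enough constant $c$ works (any $c > 6$ suffices, as then $2/J + 1/n \le 3/J < \varepsilon/2$).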
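The condition on $n$ and $J$ can be checked with exact arithmetic. The sketch below is only an illustration: it reads the condition as $(1 - \rho) - R^* - 2/J - 1/n > \varepsilon/2$ with $\rho = 1 - R^* - \varepsilon$, and the specific choice $J = \lceil 7/\varepsilon \rceil$, $n = J$ is an assumption made for this example rather than a value taken from the text. The function name `slack` is hypothetical.

```python
from fractions import Fraction
from math import ceil

def slack(R_star, eps, J, n):
    """Left-hand side (1 - rho) - R* - 2/J - 1/n of the condition, with rho = 1 - R* - eps."""
    rho = 1 - R_star - eps
    return (1 - rho) - R_star - Fraction(2, J) - Fraction(1, n)

eps = Fraction(1, 10)
R_star = Fraction(1, 2)
J = ceil(7 / eps)   # illustrative constant 7 (an assumption, not from the text)
n = J               # smallest n allowed by the requirement n >= J
lhs = slack(R_star, eps, J, n)
```

With these values $J = n = 70$ and the left-hand side equals $2/35$, which exceeds $\varepsilon/2 = 1/20$. Taking $J = \lceil 6/\varepsilon \rceil$ and $n = J$ instead gives exactly $\varepsilon/2$, so the strict inequality fails at that boundary.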

We still need to remove the assumption that the $\rho$ fraction of erasures is distributed equally among the encoded segments.

Note that if we describe the fraction of erasures in each segment by $\rho_s$, then

$$\sum_{s=1}^{N} \rho_s n = \rho nN.$$

The per-segment probability bound then becomes

$$2^J \cdot q^{-(1-\rho_s)nJ}$$

and the probability for the entire received word becomes

$$\Pr\left[\bigwedge_{t=1}^{J} C^*(m_t) \simeq y\right] \le \prod_{s=1}^{N} 2^J \cdot q^{-(1-\rho_s)nJ}.$$

Note further that the $\rho_s$ terms can be collected in the exponent, since $\sum_{s=1}^{N} (1 - \rho_s) = (1 - \rho)N$, so the bound again simplifies to inequality (10). Finally, the claim that $C^*$ has rate $R$ follows from an argument similar to that in [5] and is omitted. ■